Tony & Friends in Kelloggs Land: a promotional platformer game from the 1990s for MS-DOS (VGA), with decent graphics and music. DOSBox settings: cycles=15000 or cycles=max. There are occasional screen performance issues, but the game is playable (not fully tested). The game is in German, which does not really affect gameplay.
Topics: promo, promogame, free, kellogg, kellogg´s, kelloggs, msdos, dos, platformer, jump, scroller, sb,...
This file is a snapshot dump of the Crossref DOI metadata API, containing entries for over 94 million DOIs. Compared to the previous 2017-03 version (see archive.org item "crossref_doi_dump_201703"), this snapshot has a few million more works, but the corpus size is much larger (29 GB compressed vs. 7 GB compressed) as it now contains significantly more citation data, due to the efforts of the Initiative for Open Citations (I4OC) project. This was generated by running the scripts...
17.6M
Dec 19, 2017
A series of open web crawls targeting journal articles, technical memos, essays, datasets, and other research publications. This collection contains WARC and CDX files that end up in the Wayback Machine (https://web.archive.org). See also the bibliographic metadata corpora at https://archive.org/details/ia_biblio_metadata
Internet Archive crawldata from Survey Webwide Crawl, captured by crawl339.us.archive.org:survey from Wed Sep 13 11:45:12 PDT 2017 to Fri Sep 15 20:22:07 PDT 2017.
Topic: crawldata
743
Aug 25, 2017
image
eye 743
favorite 2
comment 0
Favicons are the (usually tiny) image files that browsers may use to represent websites in tabs, in the URL bar, or for bookmarks. This dataset contains about 360,000 favicons from popular websites. These favicons were scraped in July 2016. I wrote a crawler that went through Alexa's top 1 million sites, and made a request for 'favicon.ico' at the site root. If I got a 200 response code, I saved the result as ${site_url}.ico. For domains that were identical but for the TLD (e.g. google.com,...
Topics: images, icons, internet
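The crawl described above (request 'favicon.ico' at the site root, keep the body only on a 200 response, save it as ${site_url}.ico) can be sketched roughly as follows. This is not the author's crawler: the function names, the plain-http scheme, the timeout, and the injectable `fetch` parameter are all assumptions made for illustration, and the handling of domains that differ only by TLD is left out because the description is truncated at that point.

```python
import urllib.error
import urllib.request
from pathlib import Path
from typing import Callable, Optional, Tuple

# A fetcher takes a URL and returns (HTTP status code, response body).
Fetch = Callable[[str], Tuple[int, bytes]]


def favicon_url(domain: str) -> str:
    """Build the root-level favicon URL (plain http is an assumption)."""
    return f"http://{domain}/favicon.ico"


def http_fetch(url: str, timeout: float = 10.0) -> Tuple[int, bytes]:
    """Fetch a URL with the standard library; status 0 means a network failure."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, resp.read()
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses are raised as exceptions by urlopen.
        return err.code, b""
    except OSError:
        # DNS failures, refused connections, timeouts, and similar errors.
        return 0, b""


def save_favicon(domain: str, out_dir: str, fetch: Fetch = http_fetch) -> Optional[Path]:
    """Save <domain>.ico into out_dir if the request returns 200, else return None."""
    status, body = fetch(favicon_url(domain))
    if status != 200:
        return None
    target = Path(out_dir) / f"{domain}.ico"
    target.write_bytes(body)
    return target
```

Passing the fetcher as a parameter keeps the save logic testable without network access. Note that `urlopen` follows redirects, so a 200 here may come from a redirected location rather than the site root itself.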
14,606
Aug 15, 2017
software
eye 14,606
favorite 19
comment 1
RollerCoaster Tycoon 3 Deluxe Edition (Europe): RollerCoaster Tycoon 3 RollerCoaster Tycoon 3: Wild! RollerCoaster Tycoon 3: Soaked!
Topic: RollerCoaster Tycoon 3
270
May 3, 2017
software
eye 270
favorite 2
comment 0
Play the part of the Evil Overlord as you make your way through the land, defeating Heroes and bringing Doom with you as you go! Awesome game, fun to play--hysterical narration, be sure to listen closely :)
Topic: Dungeon Keeper Gold game play fun funny
2017 Archive.org Census Identifiers
74,063
Nov 20, 2016
web
eye 74,063
favorite 1
comment 0
home.arcor.de is going to be closed on January 31st, 2017. This item contains a best-effort grab of the users' sites. Each WARC was seeded with 1000 users and contains all assets required to display the sites (span-hosts).
Topics: arcor, isp hosting, archiveteam
1,075
Sep 15, 2016
software
eye 1,075
favorite 1
comment 0
French copy of the video game Mob Rule, also known as Street Wars and as Constructor Underworld.
Topics: studio 3, mob rule, constructor, street wars
302
May 14, 2016
software
eye 302
favorite 2
comment 0
http://archiveteam.org/index.php?title=Internet_Archive_Census An unofficial attempt to count and account for the files available on the Internet Archive, both directly downloadable, public files and private files that are available through interfaces like the Wayback Machine or the TV News Archive. The purpose of this project is multi-fold, including collections of the reported hashes of all the files, determination of sizes of various collections, and determining priorities in backing up...
Topic: IA Census
Source: torrent:urn:sha1:d5f9909f56f14867ca2e7a925cb1dadbb2a3da49
146
Mar 15, 2016
data
eye 146
favorite 2
comment 0
Platforms: DOS, Windows
Published by: Blue Byte Software GmbH
Released: 1997
Genre: Compilation
Description: The Settlers II (Gold Edition) contains:
The Settlers II: Veni, Vidi, Vici
The Settlers II Mission CD
A full world atlas
Contest entries of 130 fan-made custom maps
From Mobygames.com.
Topics: msdos, game
51,097
Jul 9, 2015
data
eye 51,097
favorite 9
comment 3
Find the dataset available for instant analysis in BigQuery and queries on this reddit...
(Here is the original Reddit comment announcing this collection of data and what the processes were.) This is an archive of Reddit comments from October of 2007 until May of 2015 (complete month). This reflects 14 months of work and a lot of API calls. This dataset includes nearly every publicly available Reddit comment. Approximately 350,000 comments out of ~1.65 billion were unavailable due to Reddit API issues. Q: How are the files structured? Each file is compressed with bzip2 compression....
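Since each file is bzip2-compressed, one way to read such a file without decompressing it to disk first is to stream it. The sketch below assumes the decompressed content is line-oriented UTF-8 text; the record format is not stated in the excerpt above, and `iter_lines` is a hypothetical helper name.

```python
import bz2
from typing import Iterator


def iter_lines(path: str, encoding: str = "utf-8") -> Iterator[str]:
    """Yield lines from a bzip2-compressed text file, decompressing as it reads."""
    with bz2.open(path, mode="rt", encoding=encoding) as handle:
        for line in handle:
            # Strip only the trailing newline so the record itself is left intact.
            yield line.rstrip("\n")
```

Because the generator yields one line at a time, memory use stays bounded by the longest line rather than by the size of the decompressed file.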
1,499
Apr 1, 2014
software
eye 1,499
favorite 1
comment 0
All the "journal article" DOIs from CrossRef's OAI-PMH server; URLs of just under 50 million journal articles.
Topics: doi, dataset
This is a panic grab of http://archive.is/alldomains.
Topics: archiveteam, archive.is, panicgrab
7,818
Dec 7, 2013
software
eye 7,818
favorite 5
comment 1
Sim City 2000 v1.0 (1994)(Maxis)
upload 420,379
Jason Scott
archivist for 10 years
Apr 1, 2011
archive.org account
person
comment 34
favorite 8
419,805
Jun 19, 2013
The Dataset Collection consists of large data archives from both sites and individuals.
2,504
May 14, 2013
Abstract While playing around with the Nmap Scripting Engine (NSE) we discovered an amazing number of open embedded devices on the Internet. Many of them are based on Linux and allow login to standard BusyBox with empty or default credentials. We used these devices to build a distributed port scanner to scan all IPv4 addresses. These scans include service probes for the most common ports, ICMP ping, reverse DNS and SYN scans. We analyzed some of the data to get an estimation of the IP address...
8.2B
Nov 17, 2012
Survey crawls are run about twice a year, on average, and attempt to capture the content of the front page of every web host ever seen by the Internet Archive since 1996.
Topic: survey crawls
37.8M
Jul 11, 2012
Crawl data donated by Alexa Internet. This data is currently not publicly accessible.
59
Jun 11, 2012
audio
eye 59
favorite 1
comment 0
0:00 ZX Spectrum Orchestra - Beepulator
3:41 Haus Arafna - Last Dream Of Jesus
7:31 Fe-Mail featuring Lasse Marhaug - Charmed
13:10 Cloverleaf - Anhedonia
15:20 :zoviet*france: - Angel's Pin Number
20:57 Grails - Black Tar Prophecy
28:10 Sister Iodine - Whitebread Column
30:54 OOIOO - UJA
38:39 Jazkamer - Metal Music Machine
42:49 Byrd E. Bath - Good Old Fashioned Balls
936M
Apr 4, 2012
Wayback indexes. This data is currently not publicly accessible.
127.1M
Mar 31, 2012
Web crawl data from Common Crawl.
12.7M
Feb 15, 2012
Take a step back in time and revisit your favorite DOS and Windows games. The files available in this collection consist primarily of PC demos, freeware, and shareware. These files are the original releases which will require intermediate to advanced knowledge to install and run on modern operating systems. Where possible online play is enabled to enjoy the game directly in your browser. New files are added to this collection on a regular basis. Specific news regarding major updates...
Topics: PC Games, Vintage computer games, Windows games, DOS games
2.1B
Nov 29, 2011
A daily collection of thousands of the most popular web sites according to Alexa.com's top sites rankings.
Topics: daily, popular sites, Alexa
3.2B
Nov 4, 2011
Focused crawls are collections of frequently-updated webcrawl data from narrow (as opposed to broad or wide) web crawls, often focused on a single domain or subdomain.
Topic: webcrawl
1.2M
Aug 2, 2011
WikiTeam software is a set of tools for archiving wikis. They work on MediaWiki wikis, but we want to expand to other wiki engines. As of January 2020, WikiTeam has preserved more than 250,000 wikis, several wikifarms, regular Wikipedia dumps and 34 TB of Wikimedia Commons images. About WikiTeam: There are thousands of wikis on the Internet. Every day some of them become no longer publicly available and, due to lack of backups, are lost forever. Millions of people download tons of media files...
Topic: wikis
1.4B
May 13, 2011
Crawl of outlinks from wikipedia.org. These files are currently not publicly accessible. From Wikipedia: Wikipedia is a multilingual, web-based, free-content encyclopedia project operated by the Wikimedia Foundation and based on an openly editable model. The name "Wikipedia" is a portmanteau of the words wiki (a technology for creating collaborative websites, from the Hawaiian word wiki, meaning "quick") and encyclopedia. Wikipedia's articles provide links to guide the...
7.8B
Apr 26, 2011
Content crawled via the Wayback Machine Live Proxy, mostly by the Save Page Now feature on web.archive.org. The liveweb proxy is a component of the Internet Archive's Wayback Machine project. It captures the content of a web page in real time, archives it into an ARC or WARC file, and returns the ARC/WARC record to the Wayback Machine for processing. The recorded ARC/WARC file becomes part of the Wayback Machine in due course.
1.4B
Apr 8, 2011
Large-scale web harvests and national domain crawls performed for National Libraries, National Archives, preservation partners, research initiatives, and as part of special projects and custom crawling and research services.
Topic: ccs
1,629
Mar 15, 2011
12.4B
Nov 16, 2010
Starting in 1996, Alexa Internet has been donating their crawl data to the Internet Archive. Flowing in every day, these data are added to the Wayback Machine after an embargo period.
Topics: web crawl, Alexa
37,254
Oct 18, 2010
photos and video from Internet Archive events
Topic: internetarchivedump
10.6M
Oct 11, 2010
Presentations and events at the Internet Archive.
Topic: collection
63B
Oct 8, 2010
The Web Archive of the Internet Archive, started in late 1996, is made available through the Wayback Machine, and some collections are available in bulk to researchers. Many pages are archived by the Internet Archive for other contributors, including partners of Archive-It and Save Page Now users. Other captures are donated to the Internet Archive by other partners such as Alexa Internet.
Topic: Web Archive
13.4B
Oct 5, 2010
Wide crawls of the Internet conducted by Internet Archive. Please visit the Wayback Machine to explore archived web sites. Since September 10th, 2010, the Internet Archive has been running Worldwide Web Crawls of the global web, capturing web elements, pages, sites and parts of sites. Each Worldwide Web Crawl was initiated from one or more lists of URLs that are known as "Seed Lists". Descriptions of the Seed Lists associated with each crawl may be provided as part of the metadata for...
14,671
Jun 16, 2010
Server logs from archive.org. Usage logs, from the webservers of the Internet Archive and the Wayback Machine.
Topic: webserverlogs
32.4B
Jun 11, 2010
The Internet Archive discovers and captures web pages through many different web crawls. At any given time several distinct crawls are running, some for months, and some every day or longer. View the web archive through the Wayback Machine.
Topic: webwidecrawl
509,256
Oct 8, 2009
301Works.org is an independent service for archiving URL mappings. The goal of the service is to protect everyday users of short URL services by providing transparency and permanence for their mappings. Shortened URL archives are in accordance with 301Works.org membership terms. Items contained in the archives are not publicly accessible at this time. 301Works Frequently Asked Questions
Topic: 301works